[ci] Add surefire fork timeouts to prevent CI hangs #6186
Open
joewiz wants to merge 1 commit into eXist-db:develop from
Configure forkedProcessTimeoutInSeconds=600 and forkedProcessExitTimeoutInSeconds=60 in both maven-surefire-plugin and maven-failsafe-plugin in exist-parent/pom.xml. This kills forked JVMs that hang (e.g. DeadlockIT, MoveResourceTest) after 10 minutes instead of leaving them to run until the 45-minute GitHub Actions step timeout.

Also reduce the integration test step timeout from 45 to 30 minutes in ci-test.yml. With surefire killing hung forks at 10 minutes, 30 minutes is plenty for the full integration suite.

Clean runs complete in ~3.5 minutes; the 600s timeout is a safety net that only fires on hung tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
dizzzz approved these changes Mar 26, 2026
Summary
When a test like `DeadlockIT` or `MoveResourceTest` hangs during CI, the surefire/failsafe forked JVM waits indefinitely; the only protection is the GitHub Actions step timeout at 45 minutes. This burns CI minutes and blocks PR merges.

This PR adds surefire fork timeouts so hung tests are killed after 10 minutes instead of 45.
What changed
`exist-parent/pom.xml`

Added to both `maven-surefire-plugin` and `maven-failsafe-plugin` configuration:

- `forkedProcessTimeoutInSeconds=600`: Kills the forked JVM after 10 minutes. Clean test runs complete in ~3.5 minutes, so this only fires on hung tests.
- `forkedProcessExitTimeoutInSeconds=60`: Gives the fork 60 seconds to flush results before force-kill.

`.github/workflows/ci-test.yml`

Reduced integration test step timeout from 45 to 30 minutes. With surefire killing hung forks at 10 minutes, 30 minutes provides ample buffer.
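For readers who have not opened the diff, the `pom.xml` change described above could look roughly like the fragment below. This is a sketch, not a copy of the actual diff: the placement under `pluginManagement` and the omission of any existing plugin settings are assumptions. Only the two parameter names and their values come from this PR.

```xml
<!-- Sketch only: assumed placement in exist-parent/pom.xml; existing plugin
     settings (versions, argLine, includes, ...) are omitted here. -->
<build>
  <pluginManagement>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-surefire-plugin</artifactId>
        <configuration>
          <!-- Kill the forked test JVM if it is still running after 10 minutes -->
          <forkedProcessTimeoutInSeconds>600</forkedProcessTimeoutInSeconds>
          <!-- After the tests finish, allow the fork 60 seconds to exit on its own -->
          <forkedProcessExitTimeoutInSeconds>60</forkedProcessExitTimeoutInSeconds>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-failsafe-plugin</artifactId>
        <configuration>
          <!-- Same two limits for integration tests run by failsafe -->
          <forkedProcessTimeoutInSeconds>600</forkedProcessTimeoutInSeconds>
          <forkedProcessExitTimeoutInSeconds>60</forkedProcessExitTimeoutInSeconds>
        </configuration>
      </plugin>
    </plugins>
  </pluginManagement>
</build>
```

Setting both values in the parent POM means every module that inherits from it gets the same limits without per-module configuration.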
Evidence the fix works
The Windows CI run on this PR proves the timeout infrastructure works:
DeadlockIT

The remaining BUILD FAILURE is a fork exit hang (BrokerPool/BlobStore shutdown delay): the fork completed all tests successfully but didn't exit within the 60s `forkedProcessExitTimeoutInSeconds`. This is addressed by companion PR #6183 (bounded BlobStore join() timeouts).

Why these values
Our hang experiment (Round 3) showed:
Test plan
`mvn test -pl exist-core`: 6542 tests, 0 failures, BUILD SUCCESS (3:25)

🤖 Generated with Claude Code